Abstract 

Background: Traditional Chinese medicine has used various herbs to treat many diseases for thousands 
of years, and it is now time to assess the characteristics and effectiveness of these medicinal plants with 
modern genetic and molecular tools. The herb Flos Lonicerae Japonicae (FLJ or Lonicera japonica Thunb.) is used as 
an anti-inflammatory agent, but the chemical quality of FLJ and its medicinal efficacy have not been consistent. Here, 
we analyzed the transcriptomes and metabolic pathways to evaluate the active medicinal compounds in FLJ, and 
we hope that this approach can be used for a variety of medicinal herbs in the future. 

Results: We assessed transcriptomic differences between FLJ and L. japonica Thunb. var. chinensis (Watts) (rFLJ), 
which may explain the variable medicinal effects. We acquired transcriptomic data (over 100 million reads) from the 
two herbs using the RNA-seq method and the Illumina GAII platform. The transcriptomic profiles contain over 6,000 
expressed sequence tags (ESTs) for each of the three flower development stages of FLJ, as well as a comparable 
number of ESTs from the rFLJ flower bud. To elucidate enzymatic divergence in biosynthetic pathways between 
the two varieties, we correlated genes and their expression profiles with known metabolic activities involving the 
relevant active compounds, including phenolic acids, flavonoids, terpenoids, and fatty acids. We also analyzed the 
diversification of genes that process the active compounds to distinguish orthologs and paralogs, together with the 
pathways for phenolic acid biosynthesis and their connections with other related pathways. 

Conclusions: Our study provides both an initial description of gene expression profiles in flowers of FLJ and its 
counterfeit rFLJ and the enzyme pool that can be used to evaluate FLJ quality. Detailed molecular-level analyses 
allow us to decipher the relationship between metabolic pathways involved in processing active medicinal 
compounds and the expression of the genes encoding their processing enzymes. Our evolutionary analysis revealed specific 
functional divergence of orthologs and paralogs, which leads to variation in gene functions that govern the profile 
of active compounds. 

Background 

Flos Lonicerae Japonicae (FLJ, Lonicera japonica Thunb.) 
is used as a herbal medicine with an anti-inflammatory 
effect [1]. The first record of this herb in the literature is 
found in one of the world's earliest pharmacopoeias, the 
Shen-Nong's Herbal. The commercial value of FLJ in 
herbal medicine trading markets has increased by over 
400% in recent years, and over 30% of current traditional 
Chinese medicine prescriptions contain FLJ. This 
important herb is used to treat various diseases, such as 
severe acute respiratory syndrome, H1N1 influenza, and 
hand-foot-and-mouth disease. FLJ extracts also show 
other biological and pharmaceutical properties, including 
anti-bacterial, anti-inflammatory, anti-viral, liver-protective 
[2], anti-angiogenic, and antinociceptive activities [3]. 
However, the quality of FLJ as a medicinal 
herb is rather inconsistent, largely because its active 
compound content is uncharacterized. 

Chlorogenic acid and luteoloside are biomarkers used 
by the Chinese Pharmacopoeia (Chinese Pharmacopoeia 
Commission, 2010) for evaluating the quality of FLJ. 
Pharmacological experiments show that luteolin has a 
spectrum of biological activities, particularly antioxida- 
tive and anti-inflammatory properties. Luteolin has a dir- 
ect inhibitory effect on lung fibrosis [4]. Other phenolic 
compounds, including phenolic acids, have been identi- 
fied in the methanolic extract of FLJ by liquid chroma- 
tography with time-of-flight mass spectrometry [5]. 
Among these compounds, loganin and sweroside also 
exhibit anti-inflammatory and analgesic activities and 
are candidate active components of the FLJ extract [6]. 
Rutin has been shown to reduce oxidative stress-mediated 
myocardial damage in both in vitro and in vivo 
models and may prove beneficial in the treatment of 
myocardial infarction [7]. 

FLJ has other commercial applications, such as use as a scent 
additive in foods and cosmetics [8]. The main 
volatile component is linalool, but other floral volatile 
compounds, including germacrene D, cis-jasmone, 
E,E-α-farnesene, nerolidol, cis-3-hexenyl acetate, hexyl acetate, 
cis-hexenyl tiglate, and indole, have been detected 
by headspace analysis and gas chromatography-mass 
spectrometry (GC-MS) [9]. 

The concentrations of active components and volatile 
compounds are closely correlated with floral developmental 
stages [10]. The content is higher in the early 
stages of cell differentiation, and the flower buds 
often show the highest medicinal value, whereas the 
chlorogenic acid content is significantly reduced in 
blooming flowers. However, the budding period is both 
short and not synchronized among individual plants, 
which makes mass harvesting problematic. 

A limited number of publications have assessed the 
relationship between the abundance of active compounds 
and floral development of FLJ using molecular biological 
techniques. One of the studies showed that the 
hydroxycinnamoyl-CoA quinate hydroxycinnamoyl transferase 
(HQT) gene, which encodes a protein of 439 amino 
acids and was identified in FLJ, has a tissue distribution that 
correlates with the pattern of chlorogenic acid abundance 
[11]. Another study cloned the multi-copy allene oxide synthase 
gene LjAOS (GenBank accession: DQ303120) from FLJ 
and demonstrated that LjAOS mRNA is most abundant 
in flower buds and that its expression correlates with the 
concentration of chlorogenic acid [12]. 

The content of active compounds also differs significantly 
among the species and varieties [13] of the Lonicerae 
family. L. japonica Thunb. var. chinensis (Watts) 
(rFLJ) is a local Chinese variety whose corolla has purple 
outer (upper) and white inner (lower) portions, and 
the whole flower has different active compound contents 
from those of FLJ [14]. Qin et al. [15] reported 
different chlorogenic acid contents between FLJ and 
rFLJ. Changes in the active compound contents often result 
in different pharmacological activity and medicinal 
quality [16]. A systematic study comparing 
gene expression and active compound variations in the 
active parts among varieties and closely related plant 
species is therefore essential. 

The high throughput of next-generation RNA sequencing 
(RNA-seq) technologies enables rapid genome-wide 
transcriptomic studies, and RNA-seq is widely used to define 
gene structure and expression profiles in model organisms 
[17-19]. The de novo assembly of transcriptomes 
from short reads generated by the RNA-seq method 
allows gene discovery in organisms without reference 
genomes. In this study, we applied RNA-seq to the 
floral transcriptomes of FLJ and rFLJ. We generated 
over 100 million reads using the Illumina GAII platform 
and analyzed over 6,000 expressed genes from each of 
the three flowering stages: bud, blooming, and bloomed. 
We correlated the transcriptomic profiles with metabolic 
activities of the relevant active compounds, including 
phenolic acids, flavonoids, terpenoids, and fatty acids, to 
elucidate the effects of enzymatic divergence on biosynthetic 
pathways. 

Methods 

Plant materials 

Flower samples (corollas or all petals) were randomly collected, 
five each, from 3-year-old FLJ and rFLJ plants in the Doudian 
plantation (Beijing, China) (Figure 1). The flowering 
stages are: (1) the bud stage (white, FLJ and red, rFLJ), 
when the flower bud has not yet bloomed into a full-size 
flower; (2) the flower1 stage, when the white inner 
petals and white (FLJ) or red (rFLJ) outer petals have just 
bloomed into full-size flowers; and (3) the flower2 stage, 
when the yellow inner petals and white (FLJ) or red (rFLJ) 
outer petals have bloomed into full-size flowers. We separated 
the samples into two groups: group 1 is used to compare 
the FLJ flower buds with FLJ flowers from the flower1 and 
flower2 stages, and group 2 is used to compare the 
flower buds of FLJ and rFLJ. Fresh samples were 
used for gas chromatography-mass spectrometry (GC-MS), 
and freeze-dried flowers were used for HPLC. 
Quick-frozen flowers (in liquid nitrogen) were used for 
RNA extraction. 

Figure 1 Plant materials. (A)-(D), Flower samples: A, FLJ bud, had white petals and had not yet bloomed into a full-size flower; B, FLJ flower1, 
had white petals and had bloomed into a full-size flower; C, FLJ flower2, had yellow petals and had bloomed into a full-size flower; D, rFLJ bud, 
had red petals and had not yet bloomed into a full-size flower. (E)-(H), Stereomicroscopic detection of trichomes (red arrows) in the flower bud and 
leaf of FLJ and rFLJ. Abbreviations: E, FLJ flower bud; F, FLJ leaf; G, rFLJ flower bud; and H, rFLJ leaf. 



RNA isolation and sequence acquisition 

Total RNA was extracted from flower samples using 
Concert Plant RNA Reagent (Invitrogen, Cat. 12322-012) 
according to the manufacturer's protocol. RNA integrity 
was measured by gel electrophoresis and with a 
spectrophotometer (NanoDrop). An Oligotex-dT30 Super 
mRNA Purification Kit from TaKaRa (D9086) was used 
to extract mRNA. 

De novo sequence assembly and contig clustering 

Prior to assembly and mapping, we removed low-quality 
sequences (such as reads with >30% "N" and polyA tails) from 
the raw data and assembled the processed reads into 
contigs using ABySS (http://www.bcgsc.ca/platform/bioinfo/software/abyss) [20]. 
We used contigs longer than 
100 bp for further annotation. Since the genome sequence 
of FLJ is not yet available, we used BLAST 
[21] to align the contigs to the NCBI non-redundant 
sequence database. Because V. vinifera full-length cDNA 
sequences provided the most annotations, we clustered 
the FLJ/rFLJ contigs in reference to the Vitis vinifera 
cDNA sequences. 



Gene annotation and expression analysis 

We used BLASTX [21] to search against the NCBI 
non-redundant (nr) database to identify transcripts and 
annotated the transcripts using KEGG [22] and COG [23] 
with an E value cut-off of 10⁻⁵. We applied InterPro [24] 
and Blast2GO [25] to annotate protein motifs/domains 
and Gene Ontology (GO) terms. GO annotation 
enrichment analyses were conducted with the Cytoscape 
plug-in BiNGO [26], using a Benjamini and Hochberg false 
discovery rate correction with significance set at p < 0.05. 
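
The Benjamini and Hochberg correction used for the GO enrichment step can be illustrated with a minimal sketch. This is not BiNGO's own code; the function name `benjamini_hochberg` and the example p-values are hypothetical and only show how raw enrichment p-values could be adjusted before the 0.05 cutoff is applied.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw enrichment p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(raw)
# Positions of GO terms that pass the corrected 0.05 threshold.
significant = [i for i, value in enumerate(q) if value < 0.05]
```

In this toy case only the first term remains significant after correction, although three raw p-values were below 0.05.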

We mapped the sequence reads to the contigs using 
SOAP (http://soap.genomics.org.cn/soapaligner.html) [27] 
and handled isoforms/splice variants with caution 
[28]. We used sequence similarity information and the 
Vitis vinifera full-length cDNAs for transcriptome mapping 
and tag counting using LASTZ [29] after clustering 
the contigs into ESTs. Only uniquely mapped reads were 
counted. Expression profiling was done by normalizing 
read counts by the total mapped reads and contig length as RPKM 
(reads per kilobase of transcript sequence per million 
mapped reads) [19]. The effective size was used to adjust 
RPKM values in subsequent analyses. DEGseq 1.2.2 [30] 
was used to roughly identify the differentially expressed 
genes (DEGs) via the p value and the RPKM fold-change 
value. The DEGs were further studied based on 
pathway/expression analyses and real-time PCR. 
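
The RPKM normalization described above can be written out as a short sketch. The function name `rpkm` and the example counts are illustrative assumptions, and the effective-size adjustment mentioned in the text is not modeled here.

```python
def rpkm(mapped_reads, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    mapped_reads: uniquely mapped reads assigned to one contig/EST
    transcript_length_bp: length of that contig/EST in base pairs
    total_mapped_reads: all uniquely mapped reads in the library
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (reads per million).
    return mapped_reads * 1e9 / (transcript_length_bp * total_mapped_reads)


# Hypothetical EST: 500 unique reads on a 2,000 bp sequence
# in a library with 10 million mapped reads.
example_value = rpkm(500, 2000, 10_000_000)
```

Dividing by both length and library size is what makes values comparable between long and short ESTs and between libraries of different depth.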

Gas chromatography-mass spectrometry profiling 

The concentrations of ethanol, acetate, alkane, and terpene 
in the flower samples were determined by 
gas chromatography-mass spectrometry (GC-MS) 
(Additional file 1: Figure S1). Fresh flower samples 
(1.00 g, n = 3) were washed twice with distilled water, 
subjected to ultrasonic extraction (90 Hz) with 10 ml 
ethyl acetate for 40 minutes, and filtered through a 
microfiltration membrane (0.45 µm). Extracted metabolites 
were analyzed as follows: 1 µl of sample was injected 
at a split ratio of 10:1 into a Shimadzu GCMS-QP2010 
instrument. A VF-5MS capillary column coated with 5% 
phenyl and 95% dimethylpolysiloxane (30 m × 0.25 mm i.d. 
and 0.1 µm film thickness; Varian, USA) was employed 
for separation. The injection temperature was 230°C and the 
interface temperature was set to 250°C. The ion source 
was adjusted to 230°C and the solvent cut-time was set 
to 3 minutes. Helium was the carrier gas at a flow rate of 
1.05 ml minute⁻¹. The temperature program was: an initial 
temperature of 50°C, ramped at 5°C minute⁻¹ 
to 150°C and held for 10 minutes, then ramped at 10°C 
minute⁻¹ to 260°C and held for 20 minutes. The mass 
spectrometric detector was operated in the electron impact 
ionization mode with an ionizing energy of 70 eV, 
scanning from 40-400 m/z. Peak identification was performed 
with the AMDIS (NIST, Gaithersburg, MD, 
USA) and WILEY7n (Palisade Corporation, NY, USA) 
databases with a spectral match quality >90%. An internal 
standard of pentadecanol was added to correct for 
differences in derivatization efficiency and changes in 
sample volume during heating. Peaks were quantified by 
area integration and concentrations were normalized to 
the quantity of the internal standard recovered. Two 
technical replicates were analyzed for three biological 
samples from each flowering stage. 
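
As a quick arithmetic check of the oven temperature program, the following sketch adds up ramp and hold times. The helper name `run_time_minutes` and the tuple layout are assumptions made for illustration, and no initial hold is included because none is stated in the text.

```python
def run_time_minutes(initial_temp_c, segments):
    """Total oven time for a list of (target_C, ramp_C_per_min, hold_min)."""
    total = 0.0
    current = initial_temp_c
    for target, ramp_rate, hold in segments:
        # Time spent ramping to the target plus the hold at that target.
        total += (target - current) / ramp_rate + hold
        current = target
    return total


# Program as described: 50°C start, 5°C/min to 150°C (hold 10 min),
# then 10°C/min to 260°C (hold 20 min).
oven_program = [(150, 5, 10), (260, 10, 20)]
total_minutes = run_time_minutes(50, oven_program)  # 20 + 10 + 11 + 20
```

Under these assumptions one injection occupies the oven for 61 minutes.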

HPLC profiling 

The dried flowers were separately ground with a 
mill. Each solid sample (40 mesh, 0.50 g) was 
accurately weighed and extracted with 50 ml of 70% 
aqueous ethanol by ultrasonication for 30 minutes. The 
extract was cooled to 25°C, diluted to 50 ml with 
70% aqueous ethanol, and filtered through a 0.45 µm Millipore 
filter membrane. Then, 10 µl of the filtrate was injected 
into the HPLC system for analysis (Additional file 1: 
Figure S2). 

The HPLC system was an Agilent 1200LC series (Agilent 
Technologies, Palo Alto, CA, USA), consisting of an 
online vacuum degasser (G1379B), a binary pump SL 
(G1312B), an auto-sampler (G1367C), a thermostatic column 
compartment (G1316B), and a diode-array detector 
(DAD) (G1315C) coupled with an analytical workstation. 
The column was an Agilent TC-C18 
reversed-phase column (5 µm, 250 mm × 4.6 mm). The 
sample injection volume was 10 µl. The detection wavelength 
was set at 242 nm, the flow rate 
at 1.0 ml minute⁻¹, and the column temperature 
at 25°C. The mobile phase consisted of deionized 
water-formic acid (A; 99:1, v/v) and methanol (B). 
The elution conditions are shown in Additional file 2: 
Table S1. To determine the linearity of the chromatographic 
techniques, calibration plots of the eight standards 
were constructed on the basis of peak areas (y) using 
seven different concentration solutions (x). All plots 
were linear in the examined ranges, and the linear ranges 
are given as concentrations of the standard 
compounds (µg ml⁻¹). The r value in Additional file 2: 
Table S2 refers to the correlation coefficient of the equation 
used to calculate the content of each compound. All the 
standard compounds showed good linearity (r > 0.9994) 
over a relatively wide concentration range. The standard 
compounds, chlorogenic acid (110753), caffeic acid 
(110885), ferulic acid (110773), rutin (100080), luteoloside 
(111720), hyperoside (111521), quercitrin (111538), 
and quercetin (100081), were purchased from the National 
Institutes for Food and Drug Control in China. 
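
The calibration step, in which peak area (y) is regressed on standard concentration (x) and the correlation coefficient r is checked, can be sketched as follows. The function names and the seven-point data set are hypothetical; the real calibration equations are those in Additional file 2: Table S2.

```python
import math


def linear_fit(x, y):
    """Least-squares line y = slope * x + intercept and Pearson r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical standard: seven concentrations (µg/ml) and idealized,
# perfectly linear peak areas, chosen only to keep the check simple.
conc = [1, 2, 5, 10, 20, 50, 100]
area = [2 * c + 1 for c in conc]
slope, intercept, r = linear_fit(conc, area)


def concentration_from_area(peak_area):
    """Invert the calibration line to estimate a sample concentration."""
    return (peak_area - intercept) / slope
```

With real standards r is slightly below 1, and a threshold such as the r > 0.9994 reported above is used to accept the calibration before sample peak areas are converted to concentrations.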

Principal component analysis and statistical assessment 
of GC-MS and HPLC data 

Data sets containing more than two independent biological 
replicates per sample were statistically analyzed 
with Student's t test at a significance cutoff of 
p < 0.05. To assess the metabolic changes or differences 
between samples and to identify metabolic changes 
involved in group discrimination, multivariate analyses 
(PCA, PLS-DA) were performed using the SIMCA-P+ 
(12.0.0.0.0) program (Umetrics AB, Tvistevägen 48, 
Umeå 907 19, Sweden). 

Phylogeny and identification of paralogs and orthologs 

We used the PFAM database [31] to validate all the 
gene families and protein sequences and constructed 
neighbor-joining trees for all sequences (ClustalW2). 
To identify orthologs, we performed an all-against-all 
sequence comparison using BLAST (cutoff <10⁻²⁰) and 
determined orthologs from the best reciprocal hits with >80% 
alignment length [32]. 
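
The reciprocal-best-hit idea can be sketched in a few lines. This is an illustration under stated assumptions rather than the pipeline actually used: the tuple layout (query, subject, E value, aligned fraction), the function names, and the sequence identifiers are hypothetical, and the 10⁻²⁰ cutoff is assumed to be an E-value threshold.

```python
def best_hits(hits, max_evalue=1e-20, min_coverage=0.8):
    """Best subject per query among hits that pass both filters.

    hits: iterable of (query, subject, evalue, aligned_fraction)
    """
    best = {}
    for query, subject, evalue, coverage in hits:
        if evalue >= max_evalue or coverage <= min_coverage:
            continue  # fails the E-value or alignment-length filter
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical BLAST results between FLJ and rFLJ sequences.
flj_vs_rflj = [
    ("FLJ_1", "rFLJ_1", 1e-50, 0.95),
    ("FLJ_1", "rFLJ_2", 1e-30, 0.90),
    ("FLJ_2", "rFLJ_2", 1e-40, 0.85),
    ("FLJ_3", "rFLJ_3", 1e-10, 0.99),  # fails the E-value cutoff
]
rflj_vs_flj = [
    ("rFLJ_1", "FLJ_1", 1e-48, 0.95),
    ("rFLJ_2", "FLJ_1", 1e-30, 0.90),
    ("rFLJ_3", "FLJ_3", 1e-10, 0.99),
]
orthologs = reciprocal_best_hits(flj_vs_rflj, rflj_vs_flj)
```

Here FLJ_2 is not paired with rFLJ_2 because rFLJ_2's own best hit is FLJ_1, which is the kind of asymmetry that separates putative paralogs from orthologs.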

Experimental validation of transcribed sequences 

We used RNA samples extracted from the flower samples 
of FLJ and rFLJ to perform qRT-PCR with the M-MLV 
reverse transcriptase cDNA Synthesis Kit from Takara. 
The PCR primers are shown in Additional file 2: Table 
S5. The amplification conditions were as follows: 
a 1-minute incubation at 95°C, followed by 35 
cycles of 95°C for 15 seconds, 57-60°C for 30 seconds, 
and 68°C for 30 seconds. PCR results were evaluated 
using 2-3% NuSieve agarose gels. 

Results 

Paired-end sequencing and de novo assembly 

We designed a paired-end sequencing strategy and 
acquired nearly saturated raw sequence data for all five 
libraries (FLJ bud, FLJ flower1, FLJ flower2, 
rFLJ bud, and rFLJ flower2), in a range of 27-41 million 
reads per library (Table 1). After quality filtering and 
assembly, the usable sequence reads per library totaled 
13-32 million. Given the read lengths of either 76 bp 
or 81 bp, the net transcriptome coverage is deemed 
adequate. 

We employed ABySS (http://www.bcgsc.ca/platform/bioinfo/software/abyss) [20], 
an assembler developed specifically 
for next-generation short-read sequences, to 
assemble the processed sequences and obtained a total 
of 180,220 contigs, ranging from 25,232 to 41,794 for 
each library. We mapped all reads to the contigs using the SOAP 
aligner tool [27], allowing up to two base mismatches. 
About half (51,607,107 reads) of the total reads are 
mapped to the contigs, and 49,821,911 reads remain 
unmapped. Specifically, 11,434,981 (46.01%) reads are 
mapped to the contigs in the rFLJ bud; 8,602,791 
(56.30%) to rFLJ flower2; 17,927,893 (56.27%) to FLJ 
bud; 8,943,545 (54.21%) to FLJ flower2; and 4,697,897 
(36.30%) to FLJ flower1. The average contig lengths are 
less than 1,000 bp, but the N50 contig sizes are over 
1,000 bp for all libraries. 
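
Because the N50 can exceed the average contig length, as reported above, a small sketch of how N50 is computed may help. The function name `n50` and the contig lengths are hypothetical and are not taken from the assemblies.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")


# Hypothetical assembly with a few long contigs and several short ones.
contig_lengths = [2000, 1500, 800, 400, 300]
mean_length = sum(contig_lengths) / len(contig_lengths)
assembly_n50 = n50(contig_lengths)
```

In this toy assembly the mean length is 1,000 bp while the N50 is 1,500 bp, because N50 weights contigs by the bases they contain rather than counting each contig equally.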



Gene annotation and expression analysis 

We used the available public information on plant genes 
and genomes for annotation and performed a similarity 
search against the GenBank non-redundant protein database 
(GenBank nr) using the BLASTx algorithm [21,33] 
with an E-value threshold of 10⁻⁵ and a size threshold 
of >100 bp. A total of 119,965 contigs (66.64%) showed 
significant similarity to known proteins, corresponding to 45,549 
unique proteins. Based on the BLAST search, 86% of the 
contigs show similarity to sequences from six plant species, 
Vitis vinifera, Ricinus communis, Populus trichocarpa, 
Arabidopsis lyrata, Glycine max, and Nicotiana 
tabacum (Additional file 1: Figure S3), and the fraction 
of sequences that match V. vinifera sequences is more 
than 50% for all five libraries. Due to the absence of genome 
information for FLJ, the full-length cDNA set of V. 
vinifera (RefSeq project by NCBI) served as the best 
reference for clustering and combining FLJ and rFLJ 
data. Moreover, our results indicate that the proportion 
of sequences with matches in the GenBank nr database 
is greater among the longer contigs. For instance, 
we observed 98.6% matching efficiency for sequences 
longer than 2,000 bp, but it decreased to 50.8% for 
sequences of 100 to 500 bp. The matching 
efficiencies for sequences of 500-1,000 bp, 
1,000-1,500 bp, and 1,500-2,000 bp are 
90.5%, 96.6%, and 98.2%, respectively. 

We defined the FLJ/rFLJ genes using LASTZ [29] with the 
V. vinifera full-length cDNAs (15,674 V. vinifera genes and an E 
value of 10⁻¹⁰) as the reference. Fragmented genes were 
also identified and joined as ESTs. The FLJ/rFLJ transcriptomes 
were defined based on the criterion that at least 
one contig mapped to a reference gene. Nearly 30% of 
the total reference genes have matches to the FLJ/rFLJ 
contigs. Finally, we identified 5,480, 5,310, 5,818, and 5,131 
unigenes in rFLJ bud, rFLJ flower2, FLJ bud, 
and FLJ flower2, respectively. Only the FLJ flower1 
library (4,483 genes) has fewer than 5,000 unigenes 
identified. 

Table 1 Summary of sequencing and assembling data 

Samples   Reads                  Average read   Number       Average contig   N50 of    Mapped     Expressed 
          Raw        Processed   length (bp)    of contigs   length (bp)      contigs   reads      genes 
FLJ B     32265956   24849381    76             32171        763              1394      11434981   6218 
FLJ F1    30007720   15279615    81             41794        412              681       8602791    6088 
FLJ F2    41392696   31858918    76             41608        544              1077      17927893   6591 
rFLJ B    31021220   16497776    81             25232        692              1214      8943545    5837 
rFLJ F2   26778792   12943328    81             39415        552              1093      4697897    5330 

Abbreviations: FLJ, Lonicera japonica Thunb.; rFLJ, Lonicera japonica Thunb. var. chinensis (Watts.); B, bud; F1, flower1; and F2, flower2. 

Functional analysis 

We carried out functional and pathway analyses using 
the Kyoto Encyclopedia of Genes and Genomes (KEGG; 
http://www.genome.jp/kaas-bin/kaas_main?mode=est_b) [22], 
and 180,020 sequences with significant matches were 
assigned to 276 KEGG pathways. Of the total, 21,692 
unigenes have enzyme commission (EC) numbers 
(Additional file 2: Table S3). We attempted to map major 
compounds that are involved in the biosynthesis of 
phenylalanine, terpenoid backbone, and fatty acid to the 
citric acid cycle, glycolysis, and sucrose metabolic path- 
ways based on sequence homologies to the known plant 
genes (Additional file 1: Figure S4). We categorized a 
total of 1,321 unigenes involved in the biosynthetic path- 
ways. In addition to genes assigned to metabolic path- 
ways, 18,417 unigenes are attributable to functions in 
genetic information processing, membrane transport, 
signal transduction, immune system, and environmental 
adaptation. These results demonstrated the power of 
high-throughput sequencing in identifying novel genes in 
non-model organisms, providing a valuable resource for 
investigating specific processes, functions, and pathways 
involved in active compound formation and their accu- 
mulation in FLJ flowers. 

Analysis of differentially-expressed genes (DEGs) 

We calculated gene expression levels based on unique 
read counts and RPKM values (reads per kilobase of 
exon model per million mapped reads) for each contig 
and EST (the length of the EST was used as the exon length 
for the contigs). Two methods were used to define 
DEGs: the DEGseq software, based on p value estimates, and 
a 2-fold RPKM difference between the two libraries as a 
threshold for each gene. The numbers of DEGs between 
different datasets are: (1) 2,316 between the FLJ bud and 
flower1, (2) 1,713 between the FLJ bud and flower2, (3) 
1,163 between FLJ flower1 and flower2, (4) 666 between 
rFLJ and FLJ bud, and (5) 692 between rFLJ and FLJ 
flower2. 
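
A minimal sketch of a fold-change based DEG call is given below. It combines the 2-fold RPKM criterion with a p-value filter in one function, which is an assumption made for illustration; the pseudocount, the 0.05 p-value cutoff, and the names `fold_change` and `classify` are hypothetical and are not settings reported in this study.

```python
def fold_change(rpkm_a, rpkm_b, pseudocount=0.01):
    """Ratio of RPKM values; the pseudocount avoids division by zero."""
    return (rpkm_a + pseudocount) / (rpkm_b + pseudocount)


def classify(rpkm_a, rpkm_b, p_value, min_fold=2.0, max_p=0.05):
    """Label a gene 'up', 'down', or 'unchanged' in library A versus B."""
    if p_value >= max_p:
        return "unchanged"
    ratio = fold_change(rpkm_a, rpkm_b)
    if ratio >= min_fold:
        return "up"
    if ratio <= 1.0 / min_fold:
        return "down"
    return "unchanged"


# Hypothetical genes: (RPKM in bud, RPKM in flower, p value).
calls = [
    classify(40.0, 10.0, 0.001),  # about 4-fold higher in bud
    classify(10.0, 40.0, 0.001),  # about 4-fold lower in bud
    classify(12.0, 10.0, 0.001),  # below the 2-fold threshold
    classify(40.0, 10.0, 0.20),   # large ratio but not significant
]
```

Requiring both criteria is more conservative than either alone, so the DEG counts reported above should not be expected to match the output of this simplified rule.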

We identified 262 (26%) up-regulated and 663 (67%) 
down-regulated genes in the flower bud in comparison 
with flower1 and flower2 in FLJ (the group 1 comparison; 
Additional file 1: Figure S5). These DEGs are concentrated 
in certain pathways, such as biosynthesis of 
plant hormones (41 DEGs), biosynthesis of terpenoids 
and steroids (28 DEGs), ribosomes (25 DEGs), biosynthesis 
of phenylpropanoids (24 DEGs), and biosynthesis 
of alkaloids derived from terpenoids and polyketides (21 
DEGs). Up-regulated genes are absent in certain pathways, 
such as fatty acid and unsaturated fatty acid biosynthesis. 
The total number of down-regulated genes in the three 
fatty acid-related pathways is 12, which is higher than 
that found in other pathways without up-regulated genes 
(Additional file 1: Figure S6). 

One of the down-regulated DEGs in pathogenesis 
(GO:0009405) is glyceraldehyde-3-phosphate dehydrogenase 
(G3PD, EC:1.2.1.12; Additional file 1: Figure S7). 
G3PD catalyzes the conversion of glyceraldehyde-3-phosphate 
to 1,3-bisphosphoglycerate in glycolysis and 
plays a critical role in the control of plant metabolism and 
development [34]. Munoz-Bertomeu reported [35] that 
the expression and catalytic activity of G3PD in anthers 
are necessary for mature pollen development in Arabidopsis. 
The exine formed in developing pollen exhibits an 
intricate pattern and is primarily composed of a polymer of 
fatty acids and phenolic compounds [36]. We show here 
that the transcription level of sc_FLJ_007660 is 9.57-fold 
and 27.54-fold higher in the group 1 comparison. 

The up-regulated DEGs are involved in transport 
(GO:0006810), transmembrane transporter activity 
(GO:0022857), and substrate-specific transporter activity 
(GO:0022892). One of the DEGs involved in localization 
(GO:0051234) is a regulator of Vps4 activity in the MVB 
protein pathway and related to pollen tube growth [37] 
(Additional file 1: Figure S7). Rapid pollen tube growth 
requires a high rate of sugar metabolism to meet ener- 
getic and biosynthetic demands [38]. The transcription 
level of sc_FLJ_015256 is 0.35-fold and 0.43-fold in the 
group 1 comparison and the result suggests a distinct 
sucrose metabolism. 

DEGs and their related metabolic pathways 

We further investigated the metabolic pathways and 
selected several representative pathways for more detailed 
analyses, including phenolic acid, terpenoid, and fatty 
acid metabolism, glycolysis, and the TCA cycle (Figure 2 
and Additional file 1: Figure S8). 

Phenolic acid pathway 

Most phenolic compounds are derived from the 
phenylpropanoid pathway (Figure 3). L-Phenylalanine is 
first converted to trans-cinnamic acid by phenylalanine 
ammonia-lyase (PAL, EC:4.3.1.24), and subsequently to 
p-coumaric acid by trans-cinnamate 4-hydroxylase (C4H). 
p-Coumaric acid can be converted into caffeic acid by 
p-coumarate 3-hydroxylase (C3H) or to p-coumaroyl-CoA 
by 4-coumarate-CoA ligase (4CL, EC:6.2.1.12). The four 
known p-coumaroyl-CoA-related reactions are catalyzed 
as follows: (1) caffeoyl-CoA O-methyltransferase 
(COMT, EC:2.1.1.104) for feruloyl-CoA, (2) cinnamyl-alcohol 
dehydrogenase (EC:1.1.1.195) for p-coumaryl alcohol, 
(3) shikimate O-hydroxycinnamoyltransferase 
(HCT) for p-coumaroyl quinic acid and chlorogenic 
acid, and (4) chalcone synthase (CHS, EC:2.3.1.74) 
and chalcone isomerase (CHI, EC:5.5.1.6) for naringenin. 
In addition, naringenin can be converted to 
dihydrokaempferol by dihydroflavonol-4-reductase 
(DFR, EC:1.1.1.219), to apiforol by flavanone 
4-reductase (F4R) or a DFR homolog, or to 
eriodictyol by flavonoid 3'-hydroxylase (F3'H). Eriodictyol 
can be further converted to luteoforol by a 
homolog of DFR (EC:1.1.1.219) and to luteolin by 
flavone synthase (FNS). Dihydrokaempferol can also 
be further converted into cis-3,4-leucopelargonidin by 
flavanone 3β-hydroxylase (F3H), and eventually to 
quercetin and rutin by flavonol synthase (FLS). 

Figure 2 Metabolic profile differences between flower buds of FLJ and rFLJ as detected based on PCA of GC-MS and HPLC data. (A) 
PCA of metabolic profiles based on GC-MS analysis (n = 3). (B) PCA of metabolic profiles based on HPLC analysis (n = 5). Y1-Y5, FLJ samples; R1-R5, 
rFLJ samples. Distinct metabolic profiles that correspond to a particular species are circled in (A) and (B). PCA analyses were performed by using 
the SIMCA-P+ (12.0.0.0.0) program (Umetrics AB, Tvistevägen 48, Umeå 907 19, Sweden). p[1] is the first principal component, and p[2] the 
second principal component. 

In the group 1 comparison, we found that the transcription 
levels of PAL, EC:2.3.1.74, EC:5.5.1.6, and 
EC:1.1.1.195 are down-regulated, whereas EC:2.1.1.104 
and EC:1.1.1.219 are up-regulated in FLJ buds (Figure 2). 
In the group 2 comparison, the transcription levels of 
seven DEGs (EC:1.14.13.11, EC:2.1.1.104, EC:6.2.1.12, 
EC:3.2.1.21, EC:4.3.1.24, EC:1.14.11.9, and EC:2.3.1.74) 
involved in phenylpropanoid and flavonoid biosynthesis 
are down-regulated. Our data strongly support the correlation 
between metabolic pathways and their related 
gene expression (Figure 4). 

Terpenoid pathway 

In the group 1 comparison, we obtained five DEGs 
in terpenoid backbone biosynthesis. The transcription 
levels of 3-hydroxy-3-methylglutaryl-CoA reductase 
(HMGR, [EC:1.1.1.34]) in the mevalonate pathway, 
and 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase 
[EC:2.7.1.148], isopentenyl-diphosphate delta-isomerase 
[EC:5.3.3.2], (E)-4-hydroxy-3-methylbut-2-enyl-diphosphate 
synthase [EC:1.17.7.1], and 
4-hydroxy-3-methylbut-2-enyl diphosphate reductase 
[EC:1.17.1.2] in the plastid MEP/DXP pathway are all 
down-regulated. Furthermore, the transcription levels 
of six DEGs (EC:1.1.1.34, EC:2.7.1.148, EC:2.2.1.7, 
EC:5.3.3.2, EC:1.17.1.2) are down-regulated but farnesyl 
diphosphate synthase [EC:2.5.1.1 2.5.1.10] is up-regulated. 
The combination of the different trends results 
in an increase in germacrene D content in FLJ buds. 

Fatty acid pathway 

The key enzymes involved in fatty acid metabolism, 
aldehyde dehydrogenase [EC:1.2.1.31 1.2.1.8], acyl-CoA 
oxidase [EC:1.3.3.6], and long-chain acyl-CoA synthetase 
[EC:6.2.1.3], are all down-regulated in the group 1 
comparison. Other DEGs, such as 3R-hydroxymyristoyl 
ACP dehydrase [EC:4.2.1.-], acyl-ACP desaturase 
[EC:1.14.19.2], fatty acyl-ACP thioesterase B [EC:3.1.2.14 
3.1.2.-], acyl-CoA oxidase [EC:1.3.3.6], and omega-3 
fatty acid desaturase (delta-15 desaturase) [EC:1.14.19.-], 
involved in the biosynthesis of fatty acids and unsaturated 
fatty acids, are also down-regulated. In addition, 
five DEGs (EC:1.14.19.-, EC:1.3.1.-, EC:3.1.2.14/3.1.2.-) 
involved in fatty acid and unsaturated fatty acid biosynthesis 
are also down-regulated in the group 2 comparison. 
Our expression data again are in complete 
agreement with the metabolic data (Figure 4). 

Figure 3 The active compound contents in flowering samples of FLJ and rFLJ. (A) The content of eight compounds: chlorogenic acid, 
caffeic acid, ferulic acid, rutin, luteoloside, hyperoside, quercitrin and quercetin (mg/g DW), as analyzed by using HPLC and calculated based on a 
linear formula (Additional file 2: Table S2). Error bars (SEs; n = 5) were calculated by using Excel software. (B) The content of volatile compounds 
(Metabolite/pentadecanol [IS] peak area ratio/1000) in FLJ and rFLJ. The Y axis indicates the relative quantification of metabolites by normalization 
of their response values to pentadecanol. 



Primary/central metabolism 

Our group 1 comparison results lead to the identification 
of down-regulated DEGs: eight (EC:1.1.1.37, 
EC:1.2.4.1, EC:1.3.99.1, EC:1.8.1.4, EC:2.3.3.1, EC:2.3.3.8, 
EC:6.2.1.4/6.2.1.5) in the citrate cycle, eleven (EC:1.1.1.1, 
EC:1.2.1.12, EC:1.2.4.1, EC:1.8.1.4, EC:4.2.1.11, EC:5.3.1.1, 
EC:5.4.2.2, EC:1.2.1.31/1.2.1.8) in glycolysis/gluconeogenesis, 
and five (EC:1.1.1.22, EC:2.4.1.13, EC:2.7.7.27, 
EC:3.2.1.2 and EC:5.4.2.2) in starch and sucrose metabolism. 
In addition, we found five up-regulated genes in 
the group 1 comparison: succinate dehydrogenase (ubiquinone) 
iron-sulfur protein [EC:1.3.5.1], pyruvate kinase 
[EC:2.7.1.40], α,α-trehalose-phosphate synthase 
(UDP-forming) [EC:2.4.1.15], 4-α-glucanotransferase 
[EC:2.4.1.25], and beta-amylase [EC:3.2.1.2]. In the 
group 2 comparison, we defined five down-regulated 
genes (EC:1.1.1.37, EC:1.8.1.4, EC:2.3.3.8, EC:4.1.2.13, 
EC:3.2.1.2, and EC:3.2.1.21) and they are involved in 
the citrate cycle, glycolysis/gluconeogenesis, and starch 
and sucrose metabolism. In contrast, we have eight 
up-regulated genes: EC:1.8.1.4, EC:4.2.1.11, EC:5.1.3.3, 
EC:5.4.2.2, EC:6.2.1.1, EC:1.1.1.22, EC:2.7.7.27, EC:5.4.2.2, 
and EC:2.7.1.40. 

Figure 4 Pathway of active compound biosynthesis in the flowering samples of FLJ. Circles represent compounds and six different colors 
colors indicate RPKM values of the ESTs. Three-fold circles and three-row squares display the content of compounds and gene expression levels. 

We also validated some less-abundant DEGs as representatives
for most of the above-mentioned pathways using qRT-PCR, and the
results are consistent with the RNA-seq data (Additional file 2:
Table S5).

Integrated analysis of active compound abundance and 
gene expression 

According to a previous report, the content of chlorogenic acid
in the FLJ flower bud is higher than that of the flowers [10].
Our study shows that the contents of chlorogenic acid and four
other active compounds (caffeic acid, ferulic acid, luteoloside,
and quercitrin) are indeed lower in the flowers, whereas three
others (isopropyl laurate, linalool, and germacrene D) remain
unchanged across the three flowering stages (Figure 5A). We also
found that the related enzymes PAL, CHS, and CHI are up-regulated
in the group 1 comparison (Figure 3). These results indicate that
reduced expression of the key genes involved in phenolic compound
skeleton biosynthesis leads to changes in the FLJ active compounds
during flowering, which provides additional information on the
medicinal value of the flower buds. We further found that the
content of isopropyl laurate follows a trend similar to the one we
observed in fatty acid biosynthesis.

Figure 5 Gene transcription level in flower buds of FLJ and rFLJ. The square represents gene expression levels, and the nine colors indicate
the RPKM values of the ESTs as calculated according to the grape full-length cDNA sequences. * denotes no differential expression between FLJ
and rFLJ.

The MEP/DXP and MVA pathways are thought to be interdependent
during the biosynthesis of isoprenoids [39]. Sesquiterpenes are
synthesized via the MVA pathway of the isoprenoid biosynthetic
system in plants, and the HMGR multi-gene family catalyzes the
synthesis of mevalonate, which is a precursor in this pathway
[40]. Inducible HMGR enzyme activity is required for sesquiterpene
accumulation in tobacco cell suspension cultures [41]. In poplar,
the transcription level of HMGR has no effect, and farnesyl
diphosphate synthase (FPS) is only transiently induced after
emission of (-)-germacrene D [42]. However, our results suggest
that the content of sesquiterpene may be negatively controlled by
HMGR and that the up-regulation of FPS may increase the production
of germacrene D.



Analysis of active compounds in FLJ and rFLJ

To gain additional insight into the active compounds and metabolic
pathway maps in FLJ and rFLJ, we generated metabolic profiles of
the active compounds from different tissues using HPLC (Figure 5)
and observed reduced production of chlorogenic acid, caffeic acid,
ferulic acid, luteoloside, and quercitrin after flowering
(Figure 5A). Our GC-MS analysis also indicates variable fatty acid
and terpenoid contents in the flowering stages (Table 2). In
addition, we observed a decreased content of isopropyl laurate and
increased contents of linalool and germacrene D during flowering
(Figure 5B). Isopropyl laurate, linalool, and germacrene D are all
known major volatile chemicals released in full-bloomed flowers
[43].

We also carried out PCA on selected organic compounds (phenolic
acids, fatty acids, and terpenoids), all detected using GC-MS and
HPLC. These compounds form two independent groups (Figure 6),
where the contents of chlorogenic acid, luteoloside, quercitin,
and isopropyl laurate are higher overall in the rFLJ flower buds
than in those of FLJ, but the germacrene D content is lower in
rFLJ.


Interactions of secondary and primary/central 
metabolisms 

The production of secondary metabolites is tightly associated
with pathways of primary/central metabolism, such as glycolysis,
the shikimate pathway, and the production of aromatic and
aliphatic amino acids [44].

GC-MS and HPLC analysis detected three compound groups produced
from primary metabolism in FLJ and rFLJ: phenolic acids,
terpenoids, and fatty acids (Additional file 1: Figure S4).

Table 2 GC-MS analysis of volatile compounds

Peak No.  RT      rFLJ          FLJ           Compounds
                  B   F1  F2    B   F1  F2
7         9.767                               cis-3-hexenyl acetate
8         10.767                              2-Ethylhexanol
10        13.908                              Linalool
12        16.092                              Octyl acetate
17        31.292                              Germacrene-D
19        40.925                              Isopropyl laurate
22        44.417                              Farnesol
26        45.956                              Pentadecanol (IS)
28        46.950                              1-Octadecyne
30        47.450  *   *   *     *   *   *     Isobutyl O-phthalate
31        48.117  *   *         *   *         2-Nonadecanone
34        48.917  *   *   *     *   *   *     Methyl glycol phthalate
35        51.217          *                   Nerolidol acetate
36        51.617          *             *     Farnesol isomer
39        52.717  *                           Normal-docosane
40        55.200          *         *   *     Didecyl ether
41        55.700      *             *         Docosanoic acid, methyl ester
44        56.717                              Allyl stearate
45        58.867                    *   *     Heptacosane
46        58.950                              Arachic alcohol
47        59.650      *   *     *   *   *     Tetracosanoic acid, methyl ester
52        64.692      *   *     *   *   *     Nonacosane

Abbreviations: FLJ, Lonicera japonica Thunb.; rFLJ, Lonicera japonica Thunb. var.
chinensis (Watts.); B, bud; F1, flower1; F2, flower2; RT, retention time; *, present
in sample; and IS, internal standard.

Phosphoenolpyruvate metabolism 

Phosphoenolpyruvate (PEP) is a glycolytic intermediate. It is
indispensable for energy metabolism in the cytosol, where
cytosolic pyruvate kinase (PK) converts it to pyruvate with the
production of ATP [45]. Inside the plastids, PEP acts as a
precursor for at least four metabolic pathways: fatty acids,
isoprenoids, branched chain amino acids, and the shikimate
pathway [46]. However, chloroplasts and most non-green plastids
lack the ability to produce PEP via glycolysis, because their
enolase is either absent or has low activity [47]. In this study,
we found that enolase (sc_FLJ_010870) transcription is 1232.8-fold
higher in the FLJ flower buds than in flower1, with no detectable
expression of the enzyme in flower2. Therefore, the formation of
PEP is negatively correlated with FLJ flowering. In contrast, the
transcription level of PK (sc_FLJ_000591), a competing enzyme, is
0.11-fold and 0.54-fold in the group 1 comparison. We observed
significant changes in neither the flavonoid content nor the ABA
and fatty acid contents, as was reported for an eno1 mutant
compared with the wild type of A. thaliana [46]. We did find that
the transcription level of enolase is up-regulated in buds as
compared with the two flower stages, as well as in the group 2
comparison, but it showed no correlation with the phenolic acid
content. In contrast, the transcription level of PK is higher in
the same comparison. Our combined analysis of the biosynthesis of
phenolic acids, terpenoids, and fatty acids did indicate a
potential correlation between the PK expression level and the
contents of flavonoids, terpenoids, and fatty acids in FLJ.
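
The fold values quoted above compare RPKM-normalized expression between libraries. As a minimal sketch, the standard RPKM definition (reads per kilobase of transcript per million mapped reads) and a simple ratio are shown here. The read counts, contig length, and library size are hypothetical, and returning None for an undetected denominator is an assumption of this sketch rather than the authors' stated procedure.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def fold_change(rpkm_a, rpkm_b):
    """Ratio rpkm_a / rpkm_b, or None when sample b has no detectable expression."""
    if rpkm_b == 0:
        return None
    return rpkm_a / rpkm_b


# Hypothetical counts for one contig in three libraries (not values from the study).
bud = rpkm(read_count=5000, transcript_length_bp=1000, total_mapped_reads=10_000_000)
flower1 = rpkm(read_count=50, transcript_length_bp=1000, total_mapped_reads=10_000_000)
flower2 = rpkm(read_count=0, transcript_length_bp=1000, total_mapped_reads=10_000_000)

print(fold_change(bud, flower1))  # ratio of bud to flower1 expression
print(fold_change(bud, flower2))  # None: no detectable expression in flower2
```

Reporting an undetected transcript separately, instead of dividing by zero or by an arbitrary pseudocount, mirrors how the text treats flower2 as "no detectable expression" rather than as a fold value.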

ATP function 

ATP synthesis could also influence the interaction of primary
and secondary metabolism. We assessed the predicted
phosphorylation sites of the relevant enzymes (Additional file 2:
Table S4) and found that all are candidates for
phosphorylation-based regulation. Among the defined DEGs, HMGR has
been shown in animals to exist in both active (dephosphorylated)
and inactive (phosphorylated) forms [48]. PK has also been shown
to be more susceptible to inhibition by ATP [49]. We analyzed the
differential expression of ATP synthesis-coupled electron
transporters and found that this gene family is down-regulated in
the group 1 comparison and up-regulated in the group 2 comparison.
Prediction of phosphorylation sites from protein sequences
indicates that HMGR (sc_FLJ_012288) has nine phosphorylation
sites, whereas pyruvate kinase (sc_FLJ_000591) has four.

Sugar regulation 

Previous studies on primary and secondary metabolism have
indicated that they are linked via the phenylalanine pool [50,51].
Therefore, it is possible that a reduced carbon flux in the
phenylpropanoid pathway may affect carbohydrate metabolism.

Matt et al. [52] showed that an increase in the sugar/amino acid
ratio resulted in an elevated production of carbon-rich
phenylpropanoids. We found that biosynthesis of phenylpropanoids
is down-regulated in both the group 1 and group 2 comparisons, and
the transcription level of β-glucosidase, which generates
β-D-glucose and α-D-glucose, is down-regulated in the group 2
comparison. Furthermore, sucrose synthase, which creates sucrose,
is down-regulated in the group 1 comparison, but β-amylase, which
generates maltose, is up-regulated in the group 1 comparison and
down-regulated in the group 2 comparison.

Figure 6 Correlation between gene expression level and active compound accumulation in FLJ. EST numbers and metabolites are shown in
the key. M1-M8 are active compounds. The color key provides R values for the correlations calculated for the flower development datasets. The
RPKM values of ESTs were calculated according to the grape full-length cDNA sequences.
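
The Figure 6 caption reports R values relating EST expression to compound accumulation across the flower development datasets. The source does not say which correlation coefficient was used; the sketch here assumes Pearson's r, and the RPKM and content values are hypothetical placeholders for bud, flower1, and flower2.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for bud, flower1, flower2 (not data from the study).
est_rpkm = [120.0, 40.0, 10.0]
compound_content = [30.0, 12.0, 4.0]
r = pearson_r(est_rpkm, compound_content)
print(f"R = {r:.3f}")
```

With only three developmental stages, any such R rests on three points, so it is best read as a descriptive summary of co-variation rather than as a significance test.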



Moriizumi [53] reported that glucose-regulated transcription of
pyruvate kinase is mediated by its glucose response element;
carbohydrate response elements are composed of two E box-like
motifs separated by 5 bp and are recognized by two basic
helix-loop-helix/leucine zipper (bHLH/LZ) proteins [54,55]. In the
group 2 context, down-regulation of pyruvate kinase may be
mediated by down-regulation of glucose biosynthesis; in the
group 1 context, however, the change in the transcription level of
glucose biosynthetic enzymes is insignificant even though pyruvate
kinase is up-regulated. Finally, two bHLH proteins (sc_FLJ_008421
and sc_FLJ_006390) are up-regulated and down-regulated in the
group 1 and 2 comparisons, respectively.

A putative sequence for a carbohydrate-response-element binding
protein (sc_FLJ_004075) was obtained based on sequence homology
(homologs in Norway rat, EMBLCDS:BAB77523, and in chicken,
GenBank:ABV72703.1). The transcript is up-regulated in the group 1
comparison, suggesting that bHLH transcription factors may be
involved in regulating the response of PK to glucose in FLJ.

The complexity of defining orthologs and paralogs for 
key metabolic pathways 

Since gene duplication is very common in plant genomes, we made
an effort to differentiate orthologs and paralogs from all
homologs. The overall sequence identity between FLJ and rFLJ
contigs is 99.0%. In our analysis, we selected 55 DEG sequences
from all pathway-related genes, of which six have slightly lower
identities, about 97.4% (Additional file 2: Table S6).
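
The identity figures above are percentages of matching positions between paired contigs. A minimal sketch of such a calculation on two pre-aligned sequences follows. The source does not describe how gaps were handled; skipping gapped columns is an assumption made here, and the two sequences are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical positions over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Invented, pre-aligned fragments standing in for an FLJ/rFLJ contig pair.
flj_fragment = "ATGGCCATTG-A"
rflj_fragment = "ATGGCTATTGCA"
identity = percent_identity(flj_fragment, rflj_fragment)
print(f"{identity:.1f}% identity")
```

Counting gapped columns as mismatches instead would lower the reported identity, so the convention matters when comparing values such as 99.0% and 97.4%.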

We identified the orthologs and paralogs of PAL, CHS, HMGR, and
PK based on the genome sequences of Arabidopsis and grape
(www.phytozome.net; Additional file 2: Table S7) and built
phylogenetic trees (Figure 7 and Additional file 1: Figure S9).
First, between FLJ and rFLJ, the PAL family genes clustered into
two groups; one contains a pair of orthologs that have no
detectable expression in the flower buds and the other has
paralogs expressed at high levels. The total RPKM of PAL paralogs
is 5.9-fold higher in rFLJ than in FLJ. Second, we have two pairs
of CHS orthologs in both FLJ and rFLJ; the FLJ paralogs are
expressed at high levels, 17-fold higher than those in rFLJ.
Third, in FLJ and rFLJ, HMGR genes are also clustered into two
groups and each has two pairs of orthologs. The FLJ paralogs are
expressed at such low levels that they may already have become
pseudogenes. The total RPKM of HMGR paralogs is 1.6-fold higher in
rFLJ than in FLJ. Finally, there are four groups of PK genes, and
the expressed PK paralogs are primarily present in two of the four
clusters (Cluster 3 and Cluster 4; Figure 7).

Figure 7 Phylogeny of predicted amino acid sequences and expression of the pyruvate kinase homologs between the FLJ and rFLJ
flower buds and the pyruvate kinase family genes in Arabidopsis and grape. The phylogenetic tree was constructed based on the
neighbor-joining method (ClustalW2). Pyruvate kinase homologs were identified based on the unique domain (PF00224) in the PFAM database.

Discussions 

Gene expression data provide a more comprehensive
understanding of FLJ as a medicinal plant

Although several studies have reported that the content of
chlorogenic acid is higher in the flower buds than in the flowers,
it has long been disputed which organ has the highest medicinal
value, owing to the use of controversial evaluation methods. In
this study, we obtained a sufficient amount of transcriptomic data
from both young buds and mature flowers, and carried out an
integrated analysis of the variations in gene expression and the
contents of active compounds. Our data from different flowering
stages indicate that the contents of eight major active compounds
either decrease (five of them) or remain unchanged (three of them)
and that the biosynthesis of the active compounds is overall
higher in the buds than in the flowers. In addition, many key
enzymes, such as PAL, CHS, and CHI, are up-regulated in the FLJ
flower buds (Figure 4). PAL is a key enzyme in the synthesis of
phenylpropanoid phytoalexins and other phenolics [56]. Previous
data have shown that PAL expression levels decline during
flowering in Nicotiana [57], and the decreased enzyme activity
corresponds to decreased levels of the major phenylpropanoid
compounds. Our data add further support for this notion, as we
found that the CHS and CHI genes are also significantly
down-regulated in the later flowering stages. CHS functions to
produce flavonoid precursors, and CHI plays a major part in the
cyclization reaction from chalcone to flavanone [58].

The quality of herbal medicine has been very difficult to
control and to evaluate, primarily because of the complexity of
the active medicinal compounds and the incomplete knowledge about
them. The primary methods that have been used for quality
evaluation of Chinese herbal medicines are chemical and
pharmacological analyses. Chemical evaluations showed that
chlorogenic acid and luteoloside are two common active compounds
found in FLJ. However, content and fingerprint analyses of one or
more of these compounds are not indicative of the medicinal value
of the plant, and our genomic approach provides a comprehensive
survey. Our study generated gene expression data for terpenoid and
fatty acid biosynthesis and adds valuable knowledge about other
FLJ compounds.

Fatty acid and phenolic acid biosynthesis may each depend on the
interaction between the two pathways [59,60]. In FLJ, the
transcription levels of long-chain acyl-CoA synthetase
(sc_FLJ_011431) and 4-coumarate-CoA ligase (sc_FLJ_001317)
decrease after flowering. Since the contents of phenolic compounds
and fatty acids are reduced in both the group 1 and 2 comparisons,
some of the biosynthetic genes for phenylpropanoid compounds and
fatty acids may be regulated in a similar way or function as
similar enzymes (such as sharing catalytic domains).

Phenolic acid biosynthesis, fatty acid biosynthesis, and the
MEP/DXP pathway show a decreasing trend in gene expression from
the medicinal organ (buds) to the non-medicinal organs (flower1
and flower2). In addition, we used phylogenetic tools for paralog
analysis and revealed that the mRNAs of these enzymes are
expressed at higher levels in the flower buds of both FLJ and
rFLJ. We found that a total of eighteen enzymes in these
biosynthetic networks are differentially expressed (Additional
file 2: Table S9).

Enzymatic divergence of orthologs and paralogs results 
in gene function variation and active compound content 

Secondary metabolism varies extensively, even between two
closely related taxonomic groups, and the underlying functional
variations, such as enzyme activities, often lead to the
production of unique compounds. In this study, we identified
several orthologous enzymes as well as their related paralogs and
evaluated their evolutionary relatedness between FLJ and rFLJ. We
observed that the RPKM values of PAL, CHS, and HMGR are higher in
rFLJ than in FLJ, and these variations in expression may lead to
alterations in the active compound contents between the two
plants.

In contrast to specific evolutionary changes to individual
enzymes during speciation, most of the functional variations
appear to be related to gene or genome duplications [61]. After
gene duplication events, most paralogs are lost over time, but
those that survive often gain new functions
(neofunctionalization), partition the original function across
different times and tissues (subfunctionalization), or lose their
functions (nonfunctionalization). Our data indicate that gene
duplication in FLJ may provide an opportunity for
neofunctionalization, whereby the PK gene and its orthologs and
paralogs may evolve to have complementary enzyme functions. For
instance, one FLJ PK paralog has a high RPKM value and may be
undergoing neofunctionalization, whereas the expression of another
paralog is higher in FLJ than in rFLJ. There is a third PK paralog
whose expression levels are quite similar in both plants.
Nevertheless, the divergence of gene expression due to duplicated
genes appears to play a direct role in the production of active
compounds in FLJ and rFLJ.

Conclusion 

We used a comparative approach to address whether transcriptomes
can be informative for the analysis of active medicinal compounds
in herbal plants. Our study not only provided an initial
description of the expression profiles of FLJ flowers, but also
identified the enzyme pool that can be used to evaluate FLJ
quality in future studies. We also associated the metabolic
pathways involved in processing active medicinal compounds with
the expression of their catalytic enzymes. We further used
sequence evolution as a tool to identify orthologs and paralogs,
as well as pathways for the biosynthesis of phenolic acid and its
interactions with other pathways (Figure 8), and revealed that
functional divergence of orthologs and paralogs may lead to
variations in gene functions that control the active compound
contents among different tissues and plants.

Figure 8 Key enzymes and proteins in regulating biosynthesis of phenolic acids in FLJ. Phenolic acids are produced from PEP by PAL, 4CL,
and CHS. PK, 4CL, and HMGR are regulated by ATP. HMGR is related to phenolic acid and hyperoside. The biosynthesis of phenolic acids is
coordinated closely with fatty acids. Converting glucose to PEP is regulated by bHLH. Abbreviations: CHS, chalcone synthase; 4CL,
4-coumarate-CoA ligase; PAL, phenylalanine ammonia-lyase; PK, pyruvate kinase; HMGR, 3-hydroxy-3-methylglutaryl-CoA reductase; FatB,
fatty acyl-ACP thioesterase B; PEP, phosphoenolpyruvate; and Pyr, pyruvate.

Additional file 1:
Figure S1. Comparison of volatile compounds of FLJ and rFLJ using gas chromatography-mass spectrometry. A, bud; B, flower1;
C, flower2. The black line represents FLJ and the red line represents rFLJ. FLJ, Lonicera japonica Thunb.; rFLJ, Lonicera
japonica Thunb. var. chinensis (Wats.).
Figure S2. HPLC analysis of active compounds in FLJ and rFLJ. A, mix of standard compounds: chlorogenic acid (RT, 13.20;
MW, 354.31); caffeic acid (RT, 17.39; MW, 180.15); ferulic acid (RT, 24.24; MW, 194.18); rutin (RT, 24.24; MW, 610.52);
luteoloside (RT, 25.27; MW, 448.4); hyperoside (RT, 25.35; MW, 464.37); quercitrin (RT, 28.65; MW, 448.38); quercetin
(RT, 38.04; MW, 302.24). B, buds of rFLJ; C, buds of FLJ. FLJ, Lonicera japonica Thunb.; rFLJ, Lonicera japonica Thunb.
var. chinensis (Wats.).
Figure S3. BlastX analysis results of contigs in FLJ and rFLJ against the non-redundant (NR) database in GenBank. Six
species (Vitis vinifera, Ricinus communis, Populus trichocarpa, Glycine max, Arabidopsis lyrata, and Nicotiana tabacum)
have the highest identity with the FLJ and rFLJ buds. The E-value cut-off was lower than 1e-5. FLJ, Lonicera japonica
Thunb.; rFLJ, Lonicera japonica Thunb. var. chinensis (Wats.).
Figure S4. Pathways of major chemical compounds in Lonicera japonica Thunb. (FLJ). All contigs from the three FLJ
libraries were annotated with the KEGG database. Six pathways (phenylalanine metabolism, terpenoid backbone, fatty acid
biosynthesis, citric acid cycle, glycolysis, and sucrose metabolism) were analyzed. Green rectangles represent enzymes
found in the FLJ transcriptome.
Figure S5. Analysis of differential gene expression using the MA-plot-based method. M is the Y-axis and represents the
intensity ratio, and A is the X-axis and represents the average intensity for each transcript. The red points are the
genes identified as differentially expressed. FLJ, Lonicera japonica Thunb.; rFLJ, Lonicera japonica Thunb. var.
chinensis (Wats.).
Figure S6. Pathway assignment based on KEGG analysis of differentially expressed genes between buds and the other two
flower developmental periods of Lonicera japonica Thunb. (Group 1). A, number of contigs with down-regulated and
up-regulated differentially expressed genes. B, number of contigs with only up-regulated differentially expressed genes.
Figure S7. Gene Ontology classification of differentially expressed genes. The results are summarized in three main
categories: biological process, cellular component, and molecular function. FLJ, Lonicera japonica Thunb.; rFLJ,
Lonicera japonica Thunb. var. chinensis (Wats.). A, Gene Ontology classification of differentially expressed genes
between buds and the two other flower developmental periods of FLJ. Bud and flower1, differentially expressed genes
between bud and flower1; bud and flower2, differentially expressed genes between bud and flower2; up-regulated, genes
up-regulated in both the bud versus flower1 and the bud versus flower2 comparisons; down-regulated, genes down-regulated
in both the bud versus flower1 and the bud versus flower2 comparisons. B, Gene Ontology classification of differentially
expressed genes between buds of FLJ and rFLJ. Group 2, differentially expressed genes between buds of FLJ and rFLJ;
up-regulated, up-regulated genes; down-regulated, down-regulated genes.
Figure S8. Gene expression level in bud, flower1, and flower2 of Lonicera japonica Thunb. Squares represent gene
expression levels and nine colors indicate the RPKM of scaffolds. B, bud; F1, flower1; F2, flower2. The RPKM of scaffolds
was calculated according to the grape full-length cDNA sequences.
Figure S9. Phylogenetic analysis of the predicted amino acid sequences and expression level of the phenylalanine
ammonia-lyase (PAL), 3-hydroxy-3-methylglutaryl-CoA reductase (HMGR), and chalcone synthase (CHS) homologues among the
buds of FLJ and rFLJ and the PAL, HMGR, and CHS family genes in Arabidopsis and grape. The phylogenetic tree was
constructed by the neighbor-joining method using ClustalW2. PAL, HMGR, and CHS homologues were identified by searching
for the domains (PF00221, PF00368, and PF00195, respectively) in the PFAM database. FLJ, Lonicera japonica Thunb.; rFLJ,
Lonicera japonica Thunb. var. chinensis (Wats.).

Additional file 2:
Table S1. The Elution Conditions of HPLC Analysis. Note: T, retention time; A, mobile phase deionized water-formic acid
(99:1, v/v); B, mobile phase methanol.
Table S2. Formula of Active Compound Content. Note: Calibration plots of eight standards were constructed on the basis
of peak areas (y) using seven different concentration solutions (x). All plots were linear in the examined ranges, and
linear ranges are shown as the concentration of the standard compounds (µg mL-1). r refers to the correlation coefficient
of the equation. The standard compounds chlorogenic acid (110753), caffeic acid (110885), ferulic acid (110773), rutin
(100080), luteoloside (111720), hyperoside (111521), quercitrin (111538), and quercetin (100081) were purchased from the
National Institutes for Food and Drug Control, China.
Table S3. Number of Contigs in KEGG Pathways. Note: FLJ, Lonicera japonica Thunb.; rFLJ, Lonicera japonica Thunb. var.
chinensis (Wats.); B, bud; F1, flower1; F2, flower2.
Table S4. Predicted Phosphorylated Sites in 34 Protein Sequences from Differentially Expressed Genes. Note:
Phosphorylated sites were predicted using online software (http://kinasephos2.mbc.nctu.edu.tw/) and protein sequences
were predicted by ORF finder (http://www.ncbi.nlm.nih.gov/gorf/gorf.html).
Table S5. qRT-PCR and RNA-seq Analysis of Gene Expression between Buds of Lonicera japonica Thunb. var. chinensis
(Wats.) and Lonicera japonica Thunb. Note: RR/YR, the ratio of transcript levels in buds of Lonicera japonica Thunb.
var. chinensis (Wats.) and Lonicera japonica Thunb.
Table S6. Ortholog Identity of Differentially Expressed Gene Sequences between FLJ and rFLJ. Note: FLJ, Lonicera
japonica Thunb.; rFLJ, Lonicera japonica Thunb. var. chinensis (Wats.).
Table S7. PAL, CHS, HMGR and PK Gene Families in Arabidopsis and Grape.
Table S8. Putative Enzyme Pool to Control the Active Compounds in Buds of Lonicera japonica Thunb.